Managing Structured Collections of Community Data
Authors
Abstract
Data management is becoming increasingly social. In collaborative scenarios where users contribute and reuse information, we observe a new form of information that resides neither in the base data nor in the schema information. This “superimposed structure” derives partly from interaction within the community, and partly from the recombination of existing data. We argue that this triad of data, schema, and higher-order structure requires new data abstractions that must, at the same time, scale efficiently to very large community databases. In addition, data generated by the community exhibits four characteristics that make scalability especially difficult: (i) inconsistency, as different users or applications have or require partially overlapping and contradictory views; (ii) non-monotonicity, as new information may revoke previous information that has already been built upon; (iii) uncertainty, as both user intent and rankings are generally uncertain; and (iv) provenance, as content contributors want to track their data, and “content re-users” evaluate their trust in it. We show promising scalable solutions to two of these problems, and illustrate the general data management challenges with a seemingly simple example from community e-learning (“ce-learning”).
1. A VISION: MASSIVE COMMUNITY E-LEARNING WITH PAIRSPACE
We will argue that managing collections of community data requires a new abstraction that does not fit well into the common dichotomy of data and schema information. We illustrate this idea with the vision of a massive online question-answer learning community of users, grouped around a hypothetical tool we refer to as PairSpace. We prefer to keep the overall setup simple. This is a concrete community data management scenario that illustrates the main issues in this paper while, at the same time, seeming to have a simple relational implementation.
Note that the underlying challenges naturally extend to more complex and general community content management scenarios.
5th Biennial Conference on Innovative Data Systems Research (CIDR ’11), January 9-12, 2011, Asilomar, California, USA. This article is published under a Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits distribution and reproduction in any medium as well as allowing derivative works, provided that you attribute the original work to the author(s) and CIDR 2011.
Similar resources
Elderly Community Dwelling Women's Experiences of Managing Strategies for Urinary Incontinence: A Qualitative Research
Introduction: Urinary incontinence (UI) is highly prevalent in older women. Little is known about how they manage this chronic condition from their points of view. The aim of this study was to explore older women’s experiences of management strategies in dealing with UI. Methods: Eight community dwelling women aged 60 and over, with long term UI participated in this qualitative st...
Managing XML documents in object-relational databases
XML becomes the standard for the representation of structured and semi-structured data on the Web. Relational and object-relational database systems are a well understood technique for managing and querying such large sets of structured data. In our approach, the nested-relations data model is the basic model for representing XML data in object-relational database systems. Using the partitioned...
Community-Contributed Media Collections: Knowledge at Our Fingertips
The widespread popularity of the Web has supported collaborative efforts to build large collections of community-contributed media. For example, social video-sharing communities like YouTube are incorporating ever-increasing amounts of user-contributed media, or photo-sharing communities like Flickr are managing a huge photographic database at a large scale. The variegated abundance of multimod...
Medici: A Scalable Multimedia Environment for Research
Luigi Marini, Rob Kooper, Joe Futrelle, Joel Plutchak, Alan Craig, Terry McLaren and James Myers
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
Large-scale community collections of images, videos, and other media are a critical resource in many areas of research and education including the physical sciences, biology, medicine, humanities, arts, and ...
Managing Very Large Scientific Data
We discuss issues in managing very large scientific data collections and describe our approach at the San Diego Supercomputer Center for supporting high performance data-intensive applications. Our systems provide metadata-based access to data sets and support collections with widely varying data characteristics.
Publication date: 2011